ABSTRACT

Given a rectangular matrix $A(x)$ that depends on the independent variables $x$, many constrained optimization methods involve computations with $Z(x)$, a matrix whose columns form a basis for the null space of $A^T(x)$. When $A$ is evaluated at a given point, it is well known that a suitable $Z$ (satisfying $A^TZ = 0$) can be obtained from standard matrix factorizations. However, Coleman and Sorensen have recently shown that standard orthogonal factorization methods may produce orthogonal bases that do not vary continuously with $x$; they also suggest several techniques for adapting these schemes so as to ensure continuity of $Z$ in the neighborhood of a given point.

This paper is an extension of an earlier note that defines the procedure for computing $Z$. Here, we first describe how $Z$ can be obtained by updating an explicit QR factorization with Householder transformations. The properties of this representation of $Z$ with respect to perturbations in $A$ are discussed, including explicit bounds on the change in $Z$. We then introduce regularized Householder transformations, and show that their use implies continuity of the full matrix $Q$. The convergence of $Z$ and $Q$ under appropriate assumptions is then proved. Finally, we indicate why the chosen form of $Z$ is convenient in certain methods for nonlinearly constrained optimization.
1. Introduction

Given an $n \times m$ matrix $A$ of rank $m$ ($m < n$), many constrained optimization methods involve computations with a (non-unique) matrix $Z$ whose $n - m$ columns form a basis for the null space of $A^T$ (i.e., such that $A^TZ = 0$). Typically, $A$ represents the Jacobian of a set of constraints, and the elements of $A$ are smooth functions of an independent variable $x$ ($x \in \Re^n$). Attention has recently been focussed on the continuity properties of the associated $Z$, which turn out to be crucial in proving local convergence for certain methods. For example, in Coleman and Conn (1982, 1984), an essential assumption is that small changes in $x$ lead to small changes in $Z$.

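It is well known that certain factorizations of $A$ provide stable and efficient means for computing $Z$. For example, given the QR factorization of $A$,

$$A = Q\begin{pmatrix} R \\ 0 \end{pmatrix}, \tag{1}$$

where $Q$ is an $n \times n$ orthogonal matrix, and $R$ is an $m \times m$ non-singular upper-triangular matrix, $Q$ may be partitioned as

$$Q = \begin{pmatrix} Y & Z \end{pmatrix}, \tag{2}$$

where $Y$ has $m$ columns and $Z$ has $n - m$ columns.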

Coleman and Sorensen (1984) observed that the standard method of computing the QR factorization through Householder matrices may not provide a continuous representation of $Z(x)$. They proposed several alternative strategies for ensuring a continuous $Z$, based on removing the discontinuity associated with the sign that defines each Householder transformation. Gill et al. (1983) present an updating technique that provides a continuous representation of $Z$. Byrd and Schnabel (1984) note that inherent discontinuities exist if $Z$ is defined as a function of $x$, except in certain special cases.

This paper is an extension of Gill et al. (1983). The matrix $Z$ is a submatrix of an explicit orthogonal matrix $Q$, which is obtained by updating the QR factorization. In Section 2, we summarize the procedure for computing $Z$ and $Q$. In Section 3 we give explicit bounds for the change in $Z$ resulting from perturbations in $x$, and show that $Z$ is continuous in the neighborhood of a point where $A$ has full rank. We then introduce the class of regularized Householder transformations, analyze the effect of perturbations in $x$ on the full matrix $Q$, and give a similar proof of continuity. In Section 4, we prove that $Z$ approaches a limit when computed at a sequence of points $\{x_k\}$ converging sufficiently fast to a suitable point $x^*$ (and similarly for $Q$ when regularized Householder transformations are used). Numerical examples are given in Section 5 to illustrate some of the results. Finally, in Section 6 we discuss the chosen representation for $Z$ in the context of algorithms for constrained optimization.


2.  Representation  and  computation  of  Z 

It is essential to distinguish between the theoretical definition of $Z$ as a matrix whose columns have specified properties, and its realization as a data structure with which computations are performed. Although a "matrix" may be used as a pedagogical convenience in describing an algorithm, it will not necessarily be represented as an explicit two-dimensional data structure within an implementation. For example, the standard Householder method for computing the QR factorization (1) (see, e.g., Stewart, 1973) results in a special sequence of $m$ Householder transformations that are stored in compact form (each represented by a vector). This implicit form of $Q$ is acceptable in many contexts; in particular, most optimization algorithms do not require the elements of $Z$, but rather only the ability to compute products involving $Z$ and its transpose. With an implicit $Q$, operations with $Z$ are performed by applying the sequence of transformations (not by explicit matrix multiplication).

In contrast, the procedure to be described obtains $Z$ from a QR factorization in which $Q$ is stored explicitly. We assume that $A(x)$ is the Jacobian of a mixture of linear and nonlinear constraints. Accordingly, let the $m$ columns of $A(x)$ be partitioned into two groups: the first $m_L$ columns (denoted by $A_L$, and termed the constant columns) are independent of $x$, and correspond to the gradients of linear constraints; the last $m_N$ columns (denoted by $A_N(x)$, and termed the variable columns) vary with $x$, and correspond to the gradients of nonlinear constraints. Thus, $A$ and $R$ in (1) have the forms

$$A = \begin{pmatrix} A_L & A_N \end{pmatrix} \quad\text{and}\quad R = \begin{pmatrix} R_L & \ast \\ 0 & R_N \end{pmatrix},$$

where $R_L$ and $R_N$ are upper-triangular. The factorization (1) of $A$ is assumed to be available, with $Q$ stored explicitly. Note that

$$A_L = Q\begin{pmatrix} R_L \\ 0 \end{pmatrix}. \tag{3}$$
Now consider a different matrix $\bar A$, given by

$$\bar A = \begin{pmatrix} A_L & \bar A_N \end{pmatrix}. \tag{4}$$

It follows from (3) that

$$\bar A = Q\begin{pmatrix} R_L & \ast \\ 0 & V \end{pmatrix},$$

where $V$ is $(n - m_L) \times m_N$. Thus, $Q$ triangularizes the first $m_L$ columns of $\bar A$, and the QR factorization of $\bar A$ can be obtained by triangularizing $V$. In order to perform this computation, we apply $m_N$ updates to the QR factorization (3) of $A_L$, while the columns of $\bar A_N$ are added one at a time. The crucial point about this approach is that the factorization is updated after each column is added. Thus, after the $i$-th column of $\bar A_N$ has been processed, an explicit orthogonal matrix is available that triangularizes columns 1 through $m_L + i$ of $\bar A$.


To describe the work associated with each update, we consider an $n \times (j - 1)$ matrix $C$ whose QR factorization is given by

$$C = Q\begin{pmatrix} R \\ 0 \end{pmatrix}. \tag{5}$$

Let $\bar C = (\, C \ \ c \,)$ for some vector $c$. Then, from (5),

$$Q^T\bar C = \begin{pmatrix} R & t \\ 0 & v \end{pmatrix}. \tag{6}$$


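We now construct the Householder transformation $H_j$ that leaves $t$ unaltered and annihilates all but the first element of $v$. If

$$v = \begin{pmatrix} \nu \\ \hat v \end{pmatrix},$$

where $\nu$ is the $j$-th diagonal element of $Q^T\bar C$, then

$$H_j(v) = I - \frac{1}{\beta}uu^T,$$

where

$$u^T = (\underbrace{0,\dots,0}_{j-1},\ \nu + \operatorname{sign}(\nu)\|v\|,\ \hat v^T), \qquad \beta = \tfrac{1}{2}\|u\|^2. \tag{7}$$

(Unless otherwise stated, $\|\cdot\|$ denotes the Euclidean vector norm or the induced matrix norm.) We then have

$$\bar C = QH_j\begin{pmatrix} R & t \\ 0 & \rho \\ 0 & 0 \end{pmatrix},$$

where $\rho = -\operatorname{sign}(\nu)\|v\|$ and

$$\bar Q = QH_j. \tag{8}$$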

The following should be noted: only columns $j$ through $n$ of $Q$ are altered by $H_j$; $H_j$ depends only on $v$ (not on $t$); and $H_j(\alpha v) = H_j(v)$, where $\alpha$ is any non-zero scalar.

In order to obtain the factors of $\bar A$ (4), the above procedure is repeated $m_N$ times, beginning with $C = A_L$, $R = R_L$, and the matrix $Q$ of (3); each column of $\bar A_N$ then takes the role of $c$ in (6). With this approach, a "current" orthogonal matrix (which, for simplicity, we shall denote by $Q$) is always available after each column is added. Because each Householder transformation is applied to $Q$ before the next transformation is constructed, $Q$ represents all previous transformations. By applying $Q$ to the new column as the first step in each update, the effect is to apply the entire sequence of Householder transformations. Therefore, each Householder matrix is not applied to the remaining (untransformed) columns of $\bar A_N$, in contrast to the standard Householder procedure.

Completion of a single update involves three steps: (i) formation of $Q^Tc$ to obtain $v$, (ii) definition of $u$, and (iii) application of the Householder transformation to $Q$. The desired factorization of $\bar A$ requires $m_N$ updates, and the final $\bar Q$ satisfies

$$\bar Q^T = H_{m_N}\cdots H_1Q^T, \tag{9}$$

where $H_i$ is the transformation constructed in the $i$-th update.

The total work required to obtain $\bar Q$ and $\bar R$ is of the order of $2nm_N(n - m_L) + nm_N(n - m_N)$ operations.
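The three steps of a single update can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function names `householder_vector` and `add_column`, the list-of-rows storage of $Q$, the 0-based index `j`, and the convention sign(0) = +1 are all assumptions made here for the example.

```python
import math

def householder_vector(v, j):
    """Return (u, beta) such that H = I - u u^T / beta leaves v[:j] unaltered
    and annihilates v[j+1:] (j is a 0-based index; hypothetical helper)."""
    tail = [float(t) for t in v[j:]]
    norm = math.sqrt(sum(t * t for t in tail))
    sign = 1.0 if tail[0] >= 0.0 else -1.0   # assumption: sign(0) = +1
    u = [0.0] * j + [tail[0] + sign * norm] + tail[1:]
    beta = 0.5 * sum(t * t for t in u)
    return u, beta

def add_column(Q, c, j):
    """Update the explicit orthogonal matrix Q (list of n rows) in place so
    that Q^T c is zero below position j; return the transformed column."""
    n = len(Q)
    # (i) form Q^T c
    qtc = [sum(Q[r][k] * c[r] for r in range(n)) for k in range(n)]
    # (ii) define the Householder vector from components j..n-1
    u, beta = householder_vector(qtc, j)
    if beta == 0.0:              # components j.. are all zero: nothing to do
        return qtc
    # (iii) Q <- Q H, which alters only columns j..n-1 of Q
    for row in Q:
        s = sum(row[k] * u[k] for k in range(j, n)) / beta
        for k in range(j, n):
            row[k] -= s * u[k]
    # H applied to Q^T c gives the new column of the triangular factor
    s = sum(qtc[k] * u[k] for k in range(j, n)) / beta
    return [qtc[k] - s * u[k] for k in range(n)]

# Usage: start from Q = I (no constant columns) and add two columns.
Q = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
columns = [[3.0, 4.0, 0.0], [1.0, 2.0, 2.0]]
R_cols = [add_column(Q, c, j) for j, c in enumerate(columns)]
Z = [row[2:] for row in Q]       # last n - m columns of Q
```

Because $Q$ is held explicitly, each new column is transformed by a single product with $Q^T$ rather than by replaying the earlier Householder vectors, which is the point emphasized in the text.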
3. Perturbation analysis of Z and Q

In this section we analyze the effect of the procedure of Section 2 when applied at points "near" a point $x$ where $A(x)$ has full rank. Roughly speaking, the desired continuity properties involve showing that small changes in $x$ lead to small changes in $Z$. The proof is complicated by the fact that the class of Householder transformations does not include the identity matrix or any matrix near it. Therefore, considering (9), the full matrix $Q$ is not continuous. However, it turns out that small changes in $x$ do lead to small changes in the columns of $Z$. Explicit bounds on the perturbation in $Z$ are derived in Section 3.1.

Although this result is satisfactory for methods in which only $Z$ is required, continuity of all of $Q$ is useful in other contexts; for example, when an explicit representation of the range space is used in an update. To extend the continuity result to all of $Q$, in Section 3.2 we introduce the class of regularized Householder matrices (which does contain the identity), and show that bounds similar to those for $Z$ in the standard case can be obtained for all of $Q$ when updates are performed with regularized Householder transformations.

3.1. Perturbation in Z. For simplicity, in this section we assume that $m_L = 0$, i.e., that all columns of $A$ are variable; the analysis can be applied in a straightforward manner when constant columns are present. Given any $\epsilon_x > 0$ and the associated neighborhood of points $x + \delta x$, where

$$\|\delta x\| \le \epsilon_x, \tag{10}$$

we analyze the computation of $Z(x + \delta x)$ from $Z(x)$ using the procedure of Section 2.

Let $A$ denote $A(x)$, and $\bar A$ denote $A(x + \delta x)$, with a similar convention for $Q$ and $Z$. The QR factorization (1) of $A$ is assumed to be given. Since $A$ is a twice-continuously differentiable function of $x$, given any $\epsilon > 0$, there exists $\epsilon_x$ such that (10) implies

$$\bar A = A + \delta A, \quad\text{where } \|\delta A\| \le \epsilon. \tag{11}$$

The existing $Q^T$ "almost" triangularizes $\bar A$, i.e.,

$$Q^T\bar A = \tilde R = \begin{pmatrix} R \\ 0 \end{pmatrix} + E, \tag{12}$$

where $\|E\| \le \epsilon$. In order to triangularize $\tilde R$, the procedure of Section 2 constructs a special sequence of $m$ Householder transformations $\{H_1, \dots, H_m\}$. Thus, we have

$$\bar A = \bar Q\begin{pmatrix} \bar R \\ 0 \end{pmatrix}, \quad\text{with } \bar Q^T = H_m\cdots H_1Q^T. \tag{13}$$

The $j$-th transformation $H_j$ is constructed so that its application to a vector $v$ does not alter components 1 through $j - 1$, and annihilates components $j + 1$ through $n$.

The matrix $\bar Z$ corresponding to $\bar A$ comprises the last $n - m$ columns of $\bar Q$. To examine the changes in this part of $Q$, we introduce a sequence of diagonal matrices $\{D_j\}$, $j = 1, \dots, m$, with

$$D_j = \operatorname{diag}(\underbrace{0,\dots,0}_{j},\ \underbrace{1,\dots,1}_{n-j}).$$

Then, from (2),

$$QD_m = (\, 0 \ \ Z \,).$$

Since

$$\bar Q^T - Q^T = (H_m\cdots H_1 - I)Q^T$$

and $\|Q\| = 1$, we obtain the following bound for the change in $Z$:

$$\|\bar Z - Z\| \le \|D_m(I - H_m\cdots H_1)\|. \tag{14}$$


To derive a bound for the right-hand side of (14), observe that the special structure of $H_j$ implies that

$$D_jH_j = D_jH_jD_{j-1}. \tag{15}$$

Using (15), the fact that $\|D_jH_j\| \le 1$ and the identity

$$D_j(I - H_j\cdots H_1) = D_j(I - H_j) + D_jH_j(I - H_{j-1}\cdots H_1),$$

we obtain

$$\|D_j(I - H_j\cdots H_1)\| = \|D_j(I - H_j) + D_jH_jD_{j-1}(I - H_{j-1}\cdots H_1)\| \le \|D_j(I - H_j)\| + \|D_{j-1}(I - H_{j-1}\cdots H_1)\|. \tag{16}$$

Therefore, if we develop a positive sequence $\{\eta_j\}$ such that

$$\eta_0 = 0 \quad\text{and}\quad \eta_j \ge \eta_{j-1} + \|D_j(I - H_j)\|, \quad j = 1, \dots, m, \tag{17}$$

it follows from (14) and (16) that

$$\|\bar Z - Z\| \le \eta_m. \tag{18}$$
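The identity that underlies (16) can be checked by expanding its right-hand side. The short derivation below is an illustrative sketch; it uses only the matrices $D_j$ (zero in its first $j$ diagonal positions, one elsewhere) and the Householder matrices $H_j$ already introduced.

```latex
\begin{align*}
D_j(I - H_j) + D_j H_j\,(I - H_{j-1}\cdots H_1)
  &= D_j - D_j H_j + D_j H_j - D_j H_j H_{j-1}\cdots H_1 \\
  &= D_j\,(I - H_j\cdots H_1).
\end{align*}
% For the norm bound: rows j+1..n of H_j vanish in columns 1..j-1, so
% D_j H_j = D_j H_j D_{j-1}; together with \|D_j H_j\| \le 1 this gives
% \|D_j H_j (I - H_{j-1}\cdots H_1)\| \le \|D_{j-1}(I - H_{j-1}\cdots H_1)\|.
```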

The quantity needed to define $\{\eta_j\}$ is an upper bound on $\|D_j(I - H_j)\|$, which we shall obtain by examining the structure of the $j$-th Householder matrix $H_j$. In order to simplify this process, we prove the following lemma, which shows that the sequence of Householder matrices used to triangularize a given matrix is unaffected by postmultiplication by an upper-triangular matrix.


Lemma 1. Let

$$H_m\cdots H_1A = \begin{pmatrix} R \\ 0 \end{pmatrix}$$

represent the reduction to triangular form of the full-rank matrix $A$ by Householder transformations as described in Section 2. Let $S$ be a nonsingular upper-triangular matrix, and let $A' = AS$. If

$$H'_m\cdots H'_1A' = \begin{pmatrix} R' \\ 0 \end{pmatrix}$$

represents the triangularization of $A'$ by Householder transformations, then

$$H_i = H'_i, \quad i = 1, \dots, m.$$

Proof. The proof is by induction. Let $a_i$ denote the $i$-th column of $A$, and similarly for $a'_i$ and $A'$. To begin the induction, note that since $a'_1$ is a nonzero multiple of $a_1$, it follows that $H_1 = H'_1$.

Then suppose, inductively, that $H_i = H'_i$, $i = 1, \dots, k - 1$. At the $k$-th step, $H_k$ is determined by the last $n - k + 1$ components of $H_{k-1}\cdots H_1a_k$, and likewise $H'_k$ is determined by the last $n - k + 1$ components of $H'_{k-1}\cdots H'_1a'_k$. By definition of $A'$ and our inductive hypothesis, we have

$$H'_{k-1}\cdots H'_1a'_k = H'_{k-1}\cdots H'_1(s_{k,k}a_k + s_{k-1,k}a_{k-1} + \cdots + s_{1,k}a_1) = H_{k-1}\cdots H_1(s_{k,k}a_k + s_{k-1,k}a_{k-1} + \cdots + s_{1,k}a_1).$$

By construction of $H_{k-1}, \dots, H_1$, the last $n - k + 1$ components of $H_{k-1}\cdots H_1a_i$, $i < k$, are zero. Therefore, the last $n - k + 1$ components of $H'_{k-1}\cdots H'_1a'_k$ are a nonzero multiple of the last $n - k + 1$ components of $H_{k-1}\cdots H_1a_k$, and it follows that $H'_k = H_k$. ∎
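Lemma 1 can also be checked numerically on a small example. The following is a minimal sketch in plain Python under stated assumptions: the helper names `householder_sequence` and `max_difference`, the test matrices `A` and `S`, and the convention sign(0) = +1 are chosen here for illustration and do not come from the paper.

```python
import math

def householder_sequence(A):
    """Return the n x n Householder matrices H_1, ..., H_m that reduce the
    n x m matrix A (list of rows) to upper-triangular form."""
    n, m = len(A), len(A[0])
    W = [[float(a) for a in row] for row in A]
    Hs = []
    for j in range(m):
        col = [W[r][j] for r in range(n)]
        norm = math.sqrt(sum(t * t for t in col[j:]))
        sign = 1.0 if col[j] >= 0.0 else -1.0   # assumption: sign(0) = +1
        u = [0.0] * j + [col[j] + sign * norm] + col[j + 1:]
        beta = 0.5 * sum(t * t for t in u)
        H = [[(1.0 if r == k else 0.0) - u[r] * u[k] / beta for k in range(n)]
             for r in range(n)]
        # apply H to the working matrix before building the next transformation
        W = [[sum(H[r][i] * W[i][k] for i in range(n)) for k in range(m)]
             for r in range(n)]
        Hs.append(H)
    return Hs

def max_difference(A, S):
    """Largest entrywise difference between the Householder matrices
    generated for A and for the postmultiplied matrix A S."""
    n, m = len(A), len(A[0])
    AS = [[sum(A[r][i] * S[i][k] for i in range(m)) for k in range(m)]
          for r in range(n)]
    return max(abs(h - g)
               for H, G in zip(householder_sequence(A), householder_sequence(AS))
               for hrow, grow in zip(H, G)
               for h, g in zip(hrow, grow))

# Full-rank 4 x 3 matrix and a nonsingular upper-triangular S
# (one diagonal entry is negative on purpose).
A = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0], [1.0, 0.0, 2.0]]
S = [[2.0, 1.0, -1.0], [0.0, -3.0, 0.5], [0.0, 0.0, 1.5]]
gap = max_difference(A, S)       # expected to be at rounding-error level
```

The negative diagonal entry in `S` exercises the property that a Householder matrix is unchanged when the vector defining it is multiplied by any nonzero scalar, positive or negative.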

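From (12), the matrix to be triangularized is

$$\begin{pmatrix} R \\ 0 \end{pmatrix} + E = WR, \tag{19}$$

where

$$W = \begin{pmatrix} I \\ 0 \end{pmatrix} + \Delta \quad\text{and}\quad \Delta = ER^{-1}.$$

Because of Lemma 1, the set of transformations that triangularize $\tilde R$ are the same as those that triangularize $W$ (a perturbation of the identity).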

Let $W_j$ denote the matrix to be reduced at the $j$-th step, i.e.,

$$W_j \equiv H_{j-1}\cdots H_1W,$$

where the first $j - 1$ columns of $W_j$ are already in upper-triangular form. Let $\tilde w_j$ denote the $j$-th column of $W_j$, let $t_j$ denote the first $j - 1$ components of $\tilde w_j$, $\nu_j$ its $j$-th component, and $\hat v_j$ its last $n - j$ components, so that

$$\tilde w_j^T = (\, t_j^T \ \ v_j^T \,), \qquad v_j^T = (\, \nu_j \ \ \hat v_j^T \,). \tag{20}$$

Thus, $v_j$ is the vector to be reduced at step $j$. From (7), the Householder vector $u_j$ is defined by

$$u_j^T = (0, \dots, 0,\ \nu_j + \operatorname{sign}(\nu_j)\|v_j\|,\ \hat v_j^T). \tag{21}$$

Note that

$$\sqrt{2}\,\|v_j\| \le \|u_j\| \le 2\|v_j\|. \tag{22}$$
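The two-sided bound on the norm of the Householder vector follows from a one-line expansion. The derivation below is an added sketch, writing $\nu_j$ for the first component of the vector $v_j$ being reduced and $\hat v_j$ for its remaining components, so that $\|v_j\|^2 = \nu_j^2 + \|\hat v_j\|^2$.

```latex
\begin{align*}
\|u_j\|^2 &= \bigl(\nu_j + \operatorname{sign}(\nu_j)\|v_j\|\bigr)^2 + \|\hat v_j\|^2 \\
          &= \nu_j^2 + 2|\nu_j|\,\|v_j\| + \|v_j\|^2 + \|\hat v_j\|^2 \\
          &= 2\|v_j\|^2 + 2|\nu_j|\,\|v_j\| .
\end{align*}
% Since 0 \le |\nu_j| \le \|v_j\|, this gives
% 2\|v_j\|^2 \le \|u_j\|^2 \le 4\|v_j\|^2, i.e.
% \sqrt{2}\,\|v_j\| \le \|u_j\| \le 2\|v_j\|.
```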


Using norm inequalities and (22), we obtain

$$\|D_j(I - H_j)\| = \frac{\|D_ju_ju_j^T\|}{\beta_j} \le \frac{\|\hat v_j\|\,\|u_j\|}{\beta_j} = \frac{2\|\hat v_j\|}{\|u_j\|} \le \sqrt{2}\,\frac{\|\hat v_j\|}{\|v_j\|}. \tag{23}$$

It follows immediately that the perturbation in components $j + 1$ through $n$ of any vector $w$ after application of $H_j$ satisfies:

$$\|D_j(w - H_jw)\| \le \sqrt{2}\,\|w\|\,\frac{\|\hat v_j\|}{\|v_j\|}. \tag{24}$$

We now develop an upper bound for $\|\hat v_j\|$ and a lower bound for $\|v_j\|$. Let $\delta_j$ denote the norm of the $j$-th column of $\Delta$ (which, from (19), is also a bound on the norm of the vector of subdiagonal elements in the $j$-th column of $W$). Because Householder matrices are orthogonal, the $j$-th column $\tilde w_j$ of $W_j$ satisfies

$$1 - \delta_j \le \|\tilde w_j\| \le 1 + \delta_j. \tag{25}$$

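To obtain a lower bound for $\|v_j\|$, we repeatedly apply (24) and (25) to bound the perturbation in components $j$ through $n$ of column $j$ of $W_j$. Formally, let the set of positive values $\{\xi_{j,i}\}$, $j = 1, \dots, m$, be defined as follows:

$$\xi_{j,1} = 1 - \delta_j, \qquad \xi_{j,i+1} = \xi_{j,i} - \sqrt{2}\,(1 + \delta_j)\,\frac{\mu_{i,i}}{\xi_{i,i}}, \quad i = 1, \dots, j - 1, \tag{26}$$

with $\mu_{i,i}$ as defined in (28). Then $v_j$, the vector to be reduced at the $j$-th step, satisfies

$$\|v_j\| \ge \xi_{j,j}. \tag{27}$$

(We assume that $\|\Delta\|$ is sufficiently small so that $\xi_{j,j}$ remains positive.)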

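Since $\hat v_j$ corresponds to the subdiagonal elements in column $j$ of $W_j$, we define the sequence $\{\mu_{j,i}\}$, $j = 1, \dots, m$, as

$$\mu_{j,1} = \delta_j, \qquad \mu_{j,i+1} = \mu_{j,i} + \sqrt{2}\,(1 + \delta_j)\,\frac{\mu_{i,i}}{\xi_{i,i}}, \quad i = 1, \dots, j - 1. \tag{28}$$

It follows from (24) and (25) that

$$\|\hat v_j\| \le \mu_{j,j}. \tag{29}$$

Therefore, using (23), (27) and (29), we have

$$\|D_j(I - H_j)\| \le \sqrt{2}\,\frac{\mu_{j,j}}{\xi_{j,j}}. \tag{30}$$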
Applying (30) to (17) and (18), we obtain the following bound:

$$\|\bar Z - Z\| \le \sqrt{2}\left(\frac{\mu_{1,1}}{\xi_{1,1}} + \cdots + \frac{\mu_{m,m}}{\xi_{m,m}}\right). \tag{31}$$


This expression is intractable as it stands. However, when $\|\Delta\|$ is small, we can obtain a simple expression for the upper bound in (31). First, note that if $\delta_j \ll 1$, then (26) implies that $\xi_{j,j}$ is of order unity for all $j$. It follows from (28) that the growth in $\mu_{j,j}$ is approximately linear, i.e.,

$$\mu_{j,j} \lesssim \sqrt{2}\sum_{i=1}^{j}\delta_i.$$

Let

$$\delta = \|ER^{-1}\|, \tag{32}$$

so that $\delta_j \le \delta$ for all $j$; then

$$\mu_{j,j} \lesssim j\sqrt{2}\,\delta.$$

Substituting in the bound on $\|\bar Z - Z\|$ in (31), we obtain

$$\|\bar Z - Z\| \lesssim m(m + 1)\,\delta. \tag{33}$$

Let $\sigma$ denote the smallest singular value of $A$. Since $\|R^{-1}\| \le \sigma^{-1}$ and $\delta \le \|E\|\,\|R^{-1}\|$, it follows from (33) that

$$\|\bar Z - Z\| \lesssim \rho\,\sigma^{-1}\|E\|, \tag{34}$$

where $\rho$ depends only on $m$. As in standard error analysis, we highlight the dependence of the bound on the condition number of $A$ by rewriting (34) as

$$\|\bar Z - Z\| \lesssim \rho\,\operatorname{cond}(A)\,\frac{\|E\|}{\|A\|},$$

where $\operatorname{cond}(A)$ is the ratio of the largest to the smallest singular values of $A$.
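The rewriting of the bound in terms of the condition number uses only the definition of the spectral norm. As an added illustration, with $\sigma$ the smallest singular value of $A$ and $\sigma_{\max}(A)$ (a symbol introduced here) the largest:

```latex
\[
\operatorname{cond}(A) = \frac{\sigma_{\max}(A)}{\sigma},
\qquad
\|A\| = \sigma_{\max}(A)
\quad\Longrightarrow\quad
\sigma^{-1}\|E\| = \operatorname{cond}(A)\,\frac{\|E\|}{\|A\|}.
\]
```

In this form the bound is expressed through the relative perturbation $\|E\|/\|A\|$, as is customary in error analysis.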

The proof of uniform continuity of $Z$ at $x$ is almost immediate. First, note that $\rho$ and $\sigma$ are independent of $\delta x$. Second, recall that $\epsilon \to 0$ as $\epsilon_x \to 0$. It follows directly from (34) that

$$\lim_{\|\delta x\| \to 0}\bar Z = Z.$$

The bound (34) is interesting because, although $Z$ is not unique, the null space itself (denoted by $\Omega$) is. Let $\bar\Omega$ denote the null space of $\bar A^T$ from (11). If we measure the distance between $\Omega$ and $\bar\Omega$ by the norm of the difference of the projectors onto them, then $\Omega$ and $\bar\Omega$ differ by a quantity that is asymptotically bounded by $\sigma^{-1}\|E\|$. In Davis and Kahan (1970), it is shown that there exists a rotation $P$ such that $P\Omega = \bar\Omega$, and $\|I - P\|$ is minimal (in this case, approximately $\sigma^{-1}\|E\|$). Thus, the choice $\bar Z = PZ$ would provide the "best" algorithm for updating $Z$. The bound (34) is larger by a factor of order $m^2$ than the bound corresponding to the optimal choice of rotation. However, for some matrices $E$, $\sigma^{-1}\|E\|$ may be a substantial overestimate of $\|ER^{-1}\|$, which may in turn be a substantial overestimate of $\delta_j$ for some $j$. This will be illustrated by Example 2 in Section 5.

3.2. Regularized Householder transformations; perturbation in Q. Although the matrix $Z$ obtained using ordinary Householder matrices undergoes small perturbations in a neighborhood of $x$, the same does not hold for $Y$ (the first $m$ columns of $Q$). In fact, the effect of applying each set of transformations $\{H_j\}$ is to change the signs of the columns of $Y$. Thus, no bound analogous to (34) can be obtained for $Y$. However, the difficulty can be circumvented by defining the regularized Householder transformation $\tilde H_j$ to be

$$\tilde H_j = \tilde D_jH_j, \quad\text{where } \tilde D_j = \operatorname{diag}(\underbrace{1,\dots,1}_{j-1},\ -1,\ 1, \dots, 1), \tag{35}$$

i.e., $\tilde H_j$ is $H_j$ with the sign of its $j$-th row reversed.

In this section, we derive a bound on $\|\bar Q - Q\|$ where $\bar Q$ is obtained from $Q$ by the procedure of Section 2, but using regularized rather than standard Householder transformations. Because the derivation of the bound is so similar to that for $\|\bar Z - Z\|$, we simply highlight the major differences.

The relationship analogous to (16) for regularized Householder transformations is

$$\|I - \tilde H_j\cdots\tilde H_1\| \le \|I - \tilde H_j\| + \|I - \tilde H_{j-1}\cdots\tilde H_1\|.$$

Hence, if we derive a sequence $\{\eta_j\}$ such that

$$\eta_0 = 0, \quad \eta_j \ge \eta_{j-1} + \|I - \tilde H_j\|,$$

then

$$\|\bar Q - Q\| \le \eta_m.$$

Lemma 1 applies to regularized Householder matrices, so that we need to consider only matrices of the form $W$ in (19).

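The critical quantity to be determined is a bound on $\|I - \tilde H_j\|$. To illustrate the process, consider $I - \tilde H_1 = I - \tilde D_1H_1$. Using (20) and (21), $v$ will denote the vector to be reduced, and the corresponding Householder vector $u$ is given by

$$u = \begin{pmatrix} \nu + \operatorname{sign}(\nu)\|v\| \\ \hat v \end{pmatrix}.$$

Writing $u = (u_1, \dots, u_n)^T$ and $\beta = \tfrac{1}{2}\|u\|^2$, we have

$$I - \tilde H_1 = \frac{1}{\beta}\begin{pmatrix} 2\beta - u_1^2 & -u_1u_2 & \cdots & -u_1u_n \\ u_2u_1 & u_2^2 & \cdots & u_2u_n \\ \vdots & \vdots & & \vdots \\ u_nu_1 & u_nu_2 & \cdots & u_n^2 \end{pmatrix}.$$

The Frobenius norm of this matrix may be obtained by direct computation and bounded using (22), giving

$$\|I - \tilde H_1\|_F = \frac{2\sqrt{2}\,\|\hat v\|}{\|u\|} \le \frac{2\|\hat v\|}{\|v\|}.$$

Since $\|\cdot\|_2 \le \|\cdot\|_F$, the following lemma follows immediately.

Lemma 2. Let $\tilde H_j$ be the regularized Householder transformation defined by (35), (20) and (21); then

$$\|I - \tilde H_j\| \le \frac{2\|\hat v_j\|}{\|v_j\|}. \tag{36}$$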
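The bound of Lemma 2 can be checked numerically for the case $j = 1$. The following is a minimal sketch in plain Python; the function names, the sample vectors and the convention sign(0) = +1 are assumptions made for this illustration only.

```python
import math

def regularized_householder(v):
    """Regularized Householder matrix for j = 1: the ordinary Householder
    matrix that reduces v to a multiple of e_1, with the sign of its first
    row reversed."""
    n = len(v)
    norm = math.sqrt(sum(t * t for t in v))
    sign = 1.0 if v[0] >= 0.0 else -1.0       # assumption: sign(0) = +1
    u = [v[0] + sign * norm] + list(v[1:])
    beta = 0.5 * sum(t * t for t in u)
    H = [[(1.0 if r == k else 0.0) - u[r] * u[k] / beta for k in range(n)]
         for r in range(n)]
    H[0] = [-h for h in H[0]]                 # reverse the sign of row 1
    return H

def distance_from_identity(H):
    """Frobenius norm of I - H (an upper bound on the 2-norm)."""
    n = len(H)
    return math.sqrt(sum(((1.0 if r == k else 0.0) - H[r][k]) ** 2
                         for r in range(n) for k in range(n)))

def lemma2_bound(v):
    """Right-hand side of the Lemma 2 bound: 2 ||v_hat|| / ||v||."""
    v_hat = math.sqrt(sum(t * t for t in v[1:]))
    return 2.0 * v_hat / math.sqrt(sum(t * t for t in v))

samples = ([1.0, 1e-3, -2e-3], [3.0, 4.0, 12.0], [-2.0, 0.5, 0.0])
distances = [distance_from_identity(regularized_householder(v)) for v in samples]
bounds = [lemma2_bound(v) for v in samples]
```

For the first sample, which is a small perturbation of a multiple of $e_1$, the regularized matrix is close to the identity, whereas the ordinary Householder matrix for the same vector is not.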

Exactly as for (24) through (31), we can then derive

$$\|\bar Q - Q\| \le 2\left(\frac{\mu_{1,1}}{\xi_{1,1}} + \cdots + \frac{\mu_{m,m}}{\xi_{m,m}}\right).$$

(Note that this differs from (31) only in the constant multiplying the right-hand side.) When $\|\Delta\|$ is small, we have the same form of bound as in (34), namely

$$\|\bar Q - Q\| \lesssim \rho\,\sigma^{-1}\|E\|,$$

where $\rho$ depends only on $m$. Continuity of $Q$ at $x$ follows exactly as for $Z$.

4.  Convergence  of  Z  and  Q 

One reason for interest in the continuity of $Z$ is in proving local convergence results for nonlinearly constrained optimization methods that maintain estimates of the projected Hessian of the Lagrangian function (e.g., Coleman and Conn, 1984a, b). Hence, we now turn to the computation of $Z$ within an iterative method that generates a sequence $\{x_k\}$, where $Q_{k+1}$ is computed from $Q_k$ using the procedure of Section 2.

We assume that $\{x_k\}$ converges to a point $x^*$ such that $A(x^*)$ has full rank. Thus, there exists an integer $K_1$ such that for all $k \ge K_1$, $A(x_k)$ has full rank; we shall consider only such values of $k$. We further assume that

$$\sum_{k=0}^{\infty}\|x_k - x^*\| < +\infty.$$

This implies that for any $\epsilon > 0$, there exists an integer $K$ such that for all $l > k \ge K \ge K_1$,

$$\|x_l - x_{l-1}\| + \cdots + \|x_{k+1} - x_k\| \le \epsilon. \tag{37}$$

For a given value of $\epsilon$ in (37), we shall consider only values of the iteration count that exceed the associated $K$.
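The step from the summability assumption to (37) can be spelled out with the triangle inequality. The derivation below is an added sketch, writing $x^*$ for the limit point of the sequence $\{x_k\}$.

```latex
\begin{align*}
\sum_{i=k}^{l-1}\|x_{i+1} - x_i\|
  \;\le\; \sum_{i=k}^{l-1}\bigl(\|x_{i+1} - x^*\| + \|x_i - x^*\|\bigr)
  \;\le\; 2\sum_{i=k}^{\infty}\|x_i - x^*\| .
\end{align*}
% The right-hand side is twice the tail of a convergent series, so it can be
% made smaller than any \epsilon > 0 by taking k \ge K with K large enough.
```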

The bound (34) shows that, for sufficiently large $K$, there exists a positive constant $M$, independent of $k$, such that for all $k \ge K$,

$$\|Z_{k+1} - Z_k\| \le M\|x_{k+1} - x_k\|.$$

Therefore, we have for all $l > k \ge K$,

$$\|Z_l - Z_k\| \le \|Z_l - Z_{l-1}\| + \cdots + \|Z_{k+1} - Z_k\| \le M\bigl(\|x_l - x_{l-1}\| + \cdots + \|x_{k+1} - x_k\|\bigr). \tag{38}$$

Because of (37), the sum on the right in (38) can be made as small as desired by appropriate choice of $K$, and hence $\|Z_l - Z_k\|$ can be made as small as desired. Thus $\{Z_k\}$ is a Cauchy sequence, and therefore converges to a limit $Z^*$ as $\{x_k\}$ converges to $x^*$. We emphasize that the limit $Z^*$ depends on the sequence $\{x_k\}$.

If regularized Householder matrices are used to define $Q_{k+1}$ from $Q_k$, exactly the same result holds for the full matrix $Q$.


5.  Numerical  examples 

In  this  section,  we  illustrate  some  properties  of  the  method  with  two  simple  examples. 

Example 1. Let $x_0 = (1, 0, 1)^T$ and $x^* = (0, 1, 2)^T$. We define the function $a(x) = x$, and consider the following sequence, which begins at $x_0$ and converges to $x^*$ so as to satisfy (37):

$$x_k = \begin{pmatrix} (-1/2)^k \\ 1 - (-1/2)^k \\ 2 - 1/2^k \end{pmatrix}.$$

With $A = a(x_0)$, the matrix $Q_0$ is the Householder matrix

$$Q_0 = \begin{pmatrix} -.70711 & 0 & -.70711 \\ 0 & 1 & 0 \\ -.70711 & 0 & .70711 \end{pmatrix}.$$

For each $k$, $Z_k$ is the last two columns of $Q_k$.

All computation was performed using double-precision arithmetic on an IBM 3081, corresponding to about 16 decimal digits of precision. All numbers shown are rounded to five figures. At steps 10 and 11, where $\|x_{11} - x_{10}\| = 2.1284 \times 10^{-3}$, we have

$$Q_{10} = \begin{pmatrix} -.000437 & -.28042 & -.95988 \\ -.89451 & .85868 & -.25066 \\ -.44704 & -.42900 & .12574 \end{pmatrix}, \qquad Q_{11} = \begin{pmatrix} -.000218 & -.28042 & -.95988 \\ .89430 & .85839 & -.25088 \\ .44748 & -.42958 & .12530 \end{pmatrix},$$

and $\|Z_{11} - Z_{10}\|_F = 8.1736 \times 10^{-4}$. Note the change in sign of the first column.
A sequence of orthogonal matrices was similarly generated for the following sequence $\{y_k\}$, which also begins at $x_0$ and converges to $x^*$:

$$y_k = \begin{pmatrix} 1/2^k \\ 1 - 1/2^k \\ 2 - 1/2^k \end{pmatrix}.$$

Note that $Q(y_0) = Q(x_0)$. However, all subsequent matrices differ for the two sequences. In both cases, the $Z$ matrices converge, but to different limits:

$$\lim_{k\to\infty}Z(x_k) = \begin{pmatrix} -.28042 & -.95988 \\ .85854 & -.25082 \\ -.42927 & .12541 \end{pmatrix}, \qquad \lim_{k\to\infty}Z(y_k) = \begin{pmatrix} -.19371 & -.98106 \\ .87749 & -.17326 \\ -.43874 & .08663 \end{pmatrix}.$$

The first column of the limiting matrix $Q(x^*)$ is $\pm(0, .89443, .44721)^T$.

For comparison, the $Q$ matrix that would result from applying a standard Householder reduction at $x^*$ is given by

$$\begin{pmatrix} 0 & -.89443 & -.44721 \\ -.89443 & -.4 & .8 \\ -.44721 & .8 & -.4 \end{pmatrix}.$$
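The computation in Example 1 is small enough to be re-run directly. The sketch below, in plain Python, assumes that each $Q_{k+1}$ is obtained from $Q_k$ by one standard Householder update with the single column $a(x_{k+1}) = x_{k+1}$, starting from the identity; the function names `update`, `x` and `z_difference` and the convention sign(0) = +1 are assumptions made here. Under these assumptions the computed value of $\|Z_{11} - Z_{10}\|_F$ can be compared with the figure reported above.

```python
import math

def update(Q, c):
    """Return Q H, where H is the Householder matrix that reduces Q^T c to a
    multiple of e_1 (case n = 3, m = 1, no constant columns)."""
    n = len(Q)
    v = [sum(Q[r][k] * c[r] for r in range(n)) for k in range(n)]
    norm = math.sqrt(sum(t * t for t in v))
    sign = 1.0 if v[0] >= 0.0 else -1.0       # assumption: sign(0) = +1
    u = [v[0] + sign * norm] + v[1:]
    beta = 0.5 * sum(t * t for t in u)
    out = []
    for row in Q:
        s = sum(row[k] * u[k] for k in range(n)) / beta
        out.append([row[k] - s * u[k] for k in range(n)])
    return out

def x(k):
    """The sequence x_k of Example 1."""
    return [(-0.5) ** k, 1.0 - (-0.5) ** k, 2.0 - 0.5 ** k]

def z_difference(k):
    """Frobenius norm of Z_{k+1} - Z_k, with Z the last two columns of Q."""
    Q = [[1.0 if r == c else 0.0 for c in range(3)] for r in range(3)]
    for i in range(k + 1):                    # builds Q_0, Q_1, ..., Q_k
        Q = update(Q, x(i))
    Q_next = update(Q, x(k + 1))
    return math.sqrt(sum((Q_next[r][c] - Q[r][c]) ** 2
                         for r in range(3) for c in (1, 2)))

change_10_to_11 = z_difference(10)
```

Replacing `x` by the second sequence $y_k$ gives the other run described in the example; the two runs share $Q_0$ but follow different paths thereafter.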


Example 2. The second example shows how the relationship between $E$ and $R$ in (19) can affect the actual change in $Z$, although the bound (34) remains unchanged. Let $A$ be $6 \times 3$, with $R$ given by

$$R = \begin{pmatrix} 10^7 & 1 & 1 \\ & 1 & 0 \\ & & 1 \end{pmatrix}.$$

The smallest singular value of $R$ is of order unity. Consider a matrix $E$ such that $\|E\| = O(1)$; then (34) implies that the change in $Z$ can be of order unity.

This  bound  is  achieved,  for  example,  if  E  is  given  by 


10  7 

IO"7 

10  7 

io-7 

E  = 


since $\|ER^{-1}\|$ is of order unity, i.e., similar in magnitude to $\|E\|\,\|R^{-1}\|$. However, if

$$E = \begin{pmatrix} 1 & 10^{-7} & 10^{-7} \\ \vdots & \vdots & \vdots \\ 1 & 10^{-7} & 10^{-7} \end{pmatrix},$$
then the perturbation in $Z$ is of order $10^{-7}$ (much less than the bound) because $\|ER^{-1}\| = O(10^{-7}) \ll \|E\|\,\|R^{-1}\|$.

6.  Comparison  with  alternative  procedures 

In  this  section,  we  compare  the  procedure  of  Section  2  with  an  alternative  technique  for  obtaining 
an  explicit  matrix  Z.  We  emphasize  that  an  explicit  Q  is  required  in  the  most  popular  algorithms 
today  for  solving  constrained  problems  with  inequality  constraints.  An  implicit  Q  is  suitable 
if  A  changes  only  by  the  addition  of  columns.  However,  any  other  change  to  A  can  be  made 
efficiently  only  by  access  to  an  explicit  Q.  Inequality  constraints  are  most  often  treated  by  posing 
a  quadratic  programming  (QP)  subproblem  with  linearized  (inequality)  versions  of  the  original 
constraints  (see  Powell,  1983,  for  a  survey  of  sequential  quadratic  programming  methods).  The 
QP  is  solved  by  developing  a  working  set  A  that  undergoes  the  addition  and  deletion  of  columns 
until  it  becomes  the  active  set  of  the  QP.  Furthermore,  if  simple  bound  constraints  are  treated 
separately  from  general  linear  constraints,  the  matrix  A  is  also  subject  to  the  addition  and/or 
deletion of rows (see Gill et al., 1984a, for details of the update procedures).

The most obvious alternative to the method of Section 2 is to apply a standard Householder procedure in which the Householder vectors are stored in compact form during the triangularization; we shall refer to this as the implicit procedure. Assuming that the $m_L$ transformations corresponding to constant columns of $A$ are retained, the matrices $\bar R$ and $\bar Q$ of Section 2 can be computed using the standard Householder procedure in $m_N(2nm_L - m_L^2)$ operations to apply the $m_L$ fixed transformations to $\bar A_N$, and $\tfrac{2}{3}m_N^3 + m_N^2(n - m)$ operations to produce the desired triangular form. The explicit matrix $\bar Q$ is then formed by multiplying the transformations together in reverse order, which requires $2nm(n - m) + \tfrac{2}{3}m^3$ operations.

When no linear constraints are present ($m_L = 0$), the implicit procedure requires less storage and work than the explicit procedure. However, as the proportion of linear constraints increases, the explicit procedure eventually requires less work (in effect, because the implicit procedure must repeatedly multiply together the Householder transformations corresponding to linear constraints in order to obtain the explicit matrix $Q$). We stress this point because many optimization problems contain a significant proportion of linear constraints. Although it is simpler to treat all constraints as nonlinear for expository purposes (as we have done in Section 3), their existence should be considered when analyzing the work associated with a practical algorithm.

To  summarize,  the  procedure  of  Section  2  ensures  the  continuity  properties  of  Z  needed  in 
many  constrained  optimization  algorithms,  and  can  easily  be  extended  to  imply  continuity  of  Q. 
Furthermore,  its  cost  is  comparable  to  (or  even  less  than)  that  of  the  implicit  procedure  when 
the  problem  contains  a  significant  proportion  of  linear  constraints. 
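The trade-off between the two procedures can be explored by evaluating the operation counts for given dimensions. The sketch below is illustrative only: the function names are hypothetical, and the formulas are the leading-order counts as read from the text above (the explicit count from Section 2 and the three implicit terms from this section), so they should be treated as assumptions rather than exact costs.

```python
def explicit_ops(n, m_L, m_N):
    """Leading-order work for the explicit updating procedure of Section 2
    (formula as read from the text; an assumption of this sketch)."""
    return 2 * n * m_N * (n - m_L) + n * m_N * (n - m_N)

def implicit_ops(n, m_L, m_N):
    """Leading-order work for the implicit procedure followed by explicit
    formation of Q (formulas as read from the text; an assumption)."""
    m = m_L + m_N
    apply_fixed = m_N * (2 * n * m_L - m_L ** 2)        # apply the m_L stored transformations
    triangularize = (2.0 / 3.0) * m_N ** 3 + m_N ** 2 * (n - m)
    form_q = 2 * n * m * (n - m) + (2.0 / 3.0) * m ** 3  # multiply transformations together
    return apply_fixed + triangularize + form_q

# No linear constraints: the implicit procedure is cheaper.
no_linear = (explicit_ops(100, 0, 50), implicit_ops(100, 0, 50))
# Mostly linear constraints: the explicit procedure is cheaper.
mostly_linear = (explicit_ops(100, 80, 10), implicit_ops(100, 80, 10))
```

With these counts the crossover described in the text appears as `m_L` grows relative to `m_N`, since the cost of re-forming $Q$ in the implicit procedure depends on the total number of constraints $m$ rather than on $m_N$ alone.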
SOL 85-1: PROPERTIES OF A REPRESENTATION OF A BASIS FOR THE NULL SPACE
by Philip E. Gill, Walter Murray, Michael A. Saunders, G. W. Stewart and Margaret H. Wright.
